Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 1000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 78.2 KiB |
| Average record size in memory | 80.1 B |
Variable types
| Categorical | 2 |
|---|---|
| Numeric | 8 |
Warnings
| Warning | Type |
|---|---|
| O3 has constant value "0.0" | Constant |
| Time has a high cardinality: 999 distinct values | High cardinality |
| Time is uniformly distributed | Uniform |
| PM25 has 903 (90.3%) zeros | Zeros |
| VOC has 104 (10.4%) zeros | Zeros |
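To make the flags above more concrete, here is a minimal sketch of how "Constant", "High cardinality" and "Zeros" checks could be computed for a single column. It is not the report generator's own logic: the function name `simple_flags` and both thresholds (`zero_share`, `cardinality_limit`) are assumptions made up for this example, and the sample columns are synthetic stand-ins with the same zero counts as those reported.

```python
def simple_flags(values, zero_share=0.10, cardinality_limit=50):
    """Return report-style flags for one column.

    Both thresholds are hypothetical defaults chosen for this sketch.
    """
    if not values:
        return []
    flags = []
    distinct = len(set(values))
    numeric = all(isinstance(v, (int, float)) for v in values)
    if distinct == 1:
        flags.append("Constant")
    elif not numeric and distinct > cardinality_limit:
        flags.append("High cardinality")
    if numeric:
        zeros = sum(1 for v in values if v == 0)
        if zeros / len(values) >= zero_share:
            flags.append("Zeros")
    return flags


# Synthetic columns shaped like the ones flagged above
o3 = ["0.0"] * 1000                 # a text column with a single value
pm25 = [0.0] * 903 + [0.21] * 97    # 903 zeros out of 1000 observations
print(simple_flags(o3), simple_flags(pm25))
```

The "Uniform" flag is left out because it needs a distribution test, which is beyond a short sketch.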
Reproduction
| Analysis started | 2021-04-01 11:03:50.300361 |
|---|---|
| Analysis finished | 2021-04-01 11:04:04.390771 |
| Duration | 14.09 seconds |
| Software version | pandas-profiling v2.12.0 |
| Download configuration | config.yaml |
Time
Categorical
| Distinct | 999 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.9 KiB |
| 09:34:53 | 2 |
|---|---|
| 10:45:52 | 1 |
| 09:20:55 | 1 |
| 14:15:03 | 1 |
| 11:04:03 | 1 |
| Other values (994) | 994 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Characters and Unicode
| Total characters | 8000 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
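As a small illustration of the character properties mentioned above, Python's standard `unicodedata` module can report the general category of each character in a value such as `16:33:01`. This is only a sketch for orientation, not the code the report generator uses; the helper name `category_counts` is made up for the example.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# 'Nd' is the code for Decimal Number, 'Po' for Other Punctuation
counts = category_counts("16:33:01")
print(dict(counts))
```

Applied to every value in a column and summed, counts of this kind would give per-category totals like those listed under "Most occurring categories".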
Unique
| Unique | 998 |
|---|---|
| Unique (%) | 99.8% |
Sample
| 1st row | 16:33:01 |
|---|---|
| 2nd row | 16:32:09 |
| 3rd row | 16:31:16 |
| 4th row | 16:30:21 |
| 5th row | 16:29:29 |
| Value | Count | Frequency (%) |
| 09:34:53 | 2 | 0.2% |
| 10:45:52 | 1 | 0.1% |
| 09:20:55 | 1 | 0.1% |
| 14:15:03 | 1 | 0.1% |
| 11:04:03 | 1 | 0.1% |
| 10:42:45 | 1 | 0.1% |
| 11:10:08 | 1 | 0.1% |
| 10:07:37 | 1 | 0.1% |
| 11:35:32 | 1 | 0.1% |
| 16:24:15 | 1 | 0.1% |
| Other values (989) | 989 | 98.9% |
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| : | 2000 | |
| 1 | 1390 | |
| 0 | 900 | |
| 5 | 672 | 8.4% |
| 2 | 619 | 7.7% |
| 4 | 612 | 7.6% |
| 3 | 602 | 7.5% |
| 6 | 317 | 4.0% |
| 7 | 300 | 3.8% |
| 8 | 295 | 3.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6000 | 75.0% |
| Other Punctuation | 2000 | 25.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1390 | 23.2% |
| 0 | 900 | 15.0% |
| 5 | 672 | 11.2% |
| 2 | 619 | 10.3% |
| 4 | 612 | 10.2% |
| 3 | 602 | 10.0% |
| 6 | 317 | 5.3% |
| 7 | 300 | 5.0% |
| 8 | 295 | 4.9% |
| 9 | 293 | 4.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 2000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| : | 2000 | |
| 1 | 1390 | |
| 0 | 900 | |
| 5 | 672 | 8.4% |
| 2 | 619 | 7.7% |
| 4 | 612 | 7.6% |
| 3 | 602 | 7.5% |
| 6 | 317 | 4.0% |
| 7 | 300 | 3.8% |
| 8 | 295 | 3.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| : | 2000 | |
| 1 | 1390 | |
| 0 | 900 | |
| 5 | 672 | 8.4% |
| 2 | 619 | 7.7% |
| 4 | 612 | 7.6% |
| 3 | 602 | 7.5% |
| 6 | 317 | 4.0% |
| 7 | 300 | 3.8% |
| 8 | 295 | 3.7% |
Temp
Real number (ℝ≥0)
| Distinct | 439 |
|---|---|
| Distinct (%) | 43.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.89512 |
| Minimum | 29.31 |
|---|---|
| Maximum | 42.94 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 29.31 |
|---|---|
| 5-th percentile | 33.7895 |
| Q1 | 34.66 |
| median | 35.94 |
| Q3 | 39.67 |
| 95-th percentile | 40.261 |
| Maximum | 42.94 |
| Range | 13.63 |
| Interquartile range (IQR) | 5.01 |
Descriptive statistics
| Standard deviation | 2.506371258 |
|---|---|
| Coefficient of variation (CV) | 0.06793232433 |
| Kurtosis | -1.313489231 |
| Mean | 36.89512 |
| Median Absolute Deviation (MAD) | 2.015 |
| Skewness | 0.06617196177 |
| Sum | 36895.12 |
| Variance | 6.281896882 |
| Monotonicity | Not monotonic |
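Several of the Temp statistics above are tied together by simple identities: range is maximum minus minimum, IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and variance is the squared standard deviation. The sketch below re-derives them from the reported figures as a consistency check; the observation count of 1000 is taken from the dataset statistics, and the variable names are chosen for this example only.

```python
import math

# Values copied from the Temp tables above
minimum, maximum = 29.31, 42.94
q1, q3 = 34.66, 39.67
mean = 36.89512
std = 2.506371258
n_observations = 1000

value_range = maximum - minimum   # reported Range: 13.63
iqr = q3 - q1                     # reported IQR: 5.01
cv = std / mean                   # reported CV: 0.06793232433
variance = std ** 2               # reported Variance: 6.281896882
total = mean * n_observations     # reported Sum: 36895.12

# Compare each derived value with the figure shown in the report
checks = {
    "range": math.isclose(value_range, 13.63, rel_tol=1e-6),
    "iqr": math.isclose(iqr, 5.01, rel_tol=1e-6),
    "cv": math.isclose(cv, 0.06793232433, rel_tol=1e-6),
    "variance": math.isclose(variance, 6.281896882, rel_tol=1e-6),
    "sum": math.isclose(total, 36895.12, rel_tol=1e-6),
}
print(checks)
```

The same identities can be applied to the other numeric variables in the report.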
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 39.69 | 17 | 1.7% |
| 39.67 | 16 | 1.6% |
| 39.74 | 13 | 1.3% |
| 39.76 | 11 | 1.1% |
| 39.7 | 11 | 1.1% |
| 39.77 | 11 | 1.1% |
| 39.72 | 10 | 1.0% |
| 34.31 | 9 | 0.9% |
| 39.65 | 9 | 0.9% |
| 39.6 | 9 | 0.9% |
| Other values (429) | 884 | 88.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 29.31 | 1 | 0.1% |
| 29.73 | 1 | 0.1% |
| 30.28 | 1 | 0.1% |
| 30.61 | 1 | 0.1% |
| 31.32 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 42.94 | 1 | 0.1% |
| 42.89 | 1 | 0.1% |
| 42.86 | 1 | 0.1% |
| 42.4 | 1 | 0.1% |
| 41.84 | 1 | 0.1% |
PM25
Real number (ℝ≥0)
| Distinct | 17 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.01845 |
| Minimum | 0 |
|---|---|
| Maximum | 0.22 |
| Zeros | 903 |
| Zeros (%) | 90.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.21 |
| Maximum | 0.22 |
| Range | 0.22 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.05833438437 |
|---|---|
| Coefficient of variation (CV) | 3.161755251 |
| Kurtosis | 6.572066325 |
| Mean | 0.01845 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.910259912 |
| Sum | 18.45 |
| Variance | 0.0034029004 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=17)
| Value | Count | Frequency (%) |
| 0 | 903 | 90.3% |
| 0.21 | 62 | 6.2% |
| 0.2 | 10 | 1.0% |
| 0.22 | 6 | 0.6% |
| 0.19 | 4 | 0.4% |
| 0.07 | 2 | 0.2% |
| 0.17 | 2 | 0.2% |
| 0.02 | 2 | 0.2% |
| 0.01 | 1 | 0.1% |
| 0.16 | 1 | 0.1% |
| Other values (7) | 7 | 0.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 0 | 903 | 90.3% |
| 0.01 | 1 | 0.1% |
| 0.02 | 2 | 0.2% |
| 0.03 | 1 | 0.1% |
| 0.05 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 0.22 | 6 | 0.6% |
| 0.21 | 62 | 6.2% |
| 0.2 | 10 | 1.0% |
| 0.19 | 4 | 0.4% |
| 0.18 | 1 | 0.1% |
lux
Real number (ℝ≥0)
| Distinct | 330 |
|---|---|
| Distinct (%) | 33.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 63.61273 |
| Minimum | 6.76 |
|---|---|
| Maximum | 1859.2 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 6.76 |
|---|---|
| 5-th percentile | 10.78 |
| Q1 | 16.0175 |
| median | 81.28 |
| Q3 | 84.39 |
| 95-th percentile | 92.04 |
| Maximum | 1859.2 |
| Range | 1852.44 |
| Interquartile range (IQR) | 68.3725 |
Descriptive statistics
| Standard deviation | 116.6429709 |
|---|---|
| Coefficient of variation (CV) | 1.833641959 |
| Kurtosis | 205.285406 |
| Mean | 63.61273 |
| Median Absolute Deviation (MAD) | 11.32 |
| Skewness | 13.73549663 |
| Sum | 63612.73 |
| Variance | 13605.58266 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 10.86 | 57 | 5.7% |
| 10.94 | 41 | 4.1% |
| 10.78 | 39 | 3.9% |
| 84.36 | 21 | 2.1% |
| 85.16 | 18 | 1.8% |
| 84.52 | 16 | 1.6% |
| 81.44 | 15 | 1.5% |
| 85.32 | 15 | 1.5% |
| 81.6 | 14 | 1.4% |
| 82.88 | 14 | 1.4% |
| Other values (320) | 750 | 75.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 6.76 | 1 | 0.1% |
| 6.92 | 3 | 0.3% |
| 7 | 1 | 0.1% |
| 7.16 | 1 | 0.1% |
| 7.24 | 4 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 1859.2 | 1 | 0.1% |
| 1820.16 | 1 | 0.1% |
| 1805.44 | 1 | 0.1% |
| 1803.52 | 1 | 0.1% |
| 259.6 | 1 | 0.1% |
VOC
Real number (ℝ≥0)
| Distinct | 466 |
|---|---|
| Distinct (%) | 46.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2580.952 |
| Minimum | 0 |
|---|---|
| Maximum | 60000 |
| Zeros | 104 |
| Zeros (%) | 10.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 27.75 |
| median | 170.5 |
| Q3 | 337.75 |
| 95-th percentile | 10811.2 |
| Maximum | 60000 |
| Range | 60000 |
| Interquartile range (IQR) | 310 |
Descriptive statistics
| Standard deviation | 10076.37519 |
|---|---|
| Coefficient of variation (CV) | 3.904131185 |
| Kurtosis | 25.63271084 |
| Mean | 2580.952 |
| Median Absolute Deviation (MAD) | 146 |
| Skewness | 5.102247517 |
| Sum | 2580952 |
| Variance | 101533337 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 104 | 10.4% |
| 60000 | 26 | 2.6% |
| 12 | 11 | 1.1% |
| 9 | 11 | 1.1% |
| 32 | 10 | 1.0% |
| 234 | 8 | 0.8% |
| 8 | 8 | 0.8% |
| 22 | 7 | 0.7% |
| 241 | 7 | 0.7% |
| 25 | 7 | 0.7% |
| Other values (456) | 801 | 80.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 0 | 104 | 10.4% |
| 1 | 6 | 0.6% |
| 2 | 3 | 0.3% |
| 3 | 4 | 0.4% |
| 4 | 4 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 60000 | 26 | 2.6% |
| 49264 | 1 | 0.1% |
| 44454 | 1 | 0.1% |
| 32637 | 1 | 0.1% |
| 32044 | 1 | 0.1% |
CO
Real number (ℝ≥0)
| Distinct | 273 |
|---|---|
| Distinct (%) | 27.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.88559 |
| Minimum | 3.03 |
|---|---|
| Maximum | 61.49 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 3.03 |
|---|---|
| 5-th percentile | 3.38 |
| Q1 | 3.54 |
| median | 3.69 |
| Q3 | 3.9925 |
| 95-th percentile | 42.7715 |
| Maximum | 61.49 |
| Range | 58.46 |
| Interquartile range (IQR) | 0.4525 |
Descriptive statistics
| Standard deviation | 11.85801595 |
|---|---|
| Coefficient of variation (CV) | 1.503757607 |
| Kurtosis | 6.630497451 |
| Mean | 7.88559 |
| Median Absolute Deviation (MAD) | 0.18 |
| Skewness | 2.833060716 |
| Sum | 7885.59 |
| Variance | 140.6125422 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3.57 | 61 | 6.1% |
| 3.69 | 51 | 5.1% |
| 3.7 | 30 | 3.0% |
| 3.38 | 29 | 2.9% |
| 3.39 | 24 | 2.4% |
| 3.72 | 23 | 2.3% |
| 3.56 | 23 | 2.3% |
| 3.52 | 23 | 2.3% |
| 3.47 | 23 | 2.3% |
| 3.68 | 22 | 2.2% |
| Other values (263) | 691 | 69.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 3.03 | 2 | 0.2% |
| 3.33 | 10 | 1.0% |
| 3.34 | 12 | 1.2% |
| 3.35 | 4 | 0.4% |
| 3.36 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 61.49 | 1 | 0.1% |
| 55.45 | 1 | 0.1% |
| 54.31 | 1 | 0.1% |
| 53.89 | 1 | 0.1% |
| 53.85 | 1 | 0.1% |
CO2
Real number (ℝ≥0)
| Distinct | 259 |
|---|---|
| Distinct (%) | 25.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1847.559 |
| Minimum | 400 |
|---|---|
| Maximum | 57330 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 400 |
|---|---|
| 5-th percentile | 400 |
| Q1 | 400 |
| median | 400 |
| Q3 | 530 |
| 95-th percentile | 1149.15 |
| Maximum | 57330 |
| Range | 56930 |
| Interquartile range (IQR) | 130 |
Descriptive statistics
| Standard deviation | 8522.870237 |
|---|---|
| Coefficient of variation (CV) | 4.613043609 |
| Kurtosis | 38.21506597 |
| Mean | 1847.559 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.319841758 |
| Sum | 1847559 |
| Variance | 72639317.07 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 400 | 622 | 62.2% |
| 57330 | 21 | 2.1% |
| 467 | 5 | 0.5% |
| 435 | 4 | 0.4% |
| 484 | 4 | 0.4% |
| 403 | 4 | 0.4% |
| 536 | 4 | 0.4% |
| 431 | 4 | 0.4% |
| 509 | 3 | 0.3% |
| 530 | 3 | 0.3% |
| Other values (249) | 326 | 32.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 400 | 622 | 62.2% |
| 401 | 1 | 0.1% |
| 403 | 4 | 0.4% |
| 404 | 1 | 0.1% |
| 405 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 57330 | 21 | 2.1% |
| 56554 | 1 | 0.1% |
| 55063 | 1 | 0.1% |
| 13682 | 1 | 0.1% |
| 9773 | 1 | 0.1% |
O3
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.9 KiB |
| 0.0 |
|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
| Value | Count | Frequency (%) |
| 0.0 | 1000 | 100.0% |
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2000 | 66.7% |
| . | 1000 | 33.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2000 | 66.7% |
| Other Punctuation | 1000 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2000 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 2000 | 66.7% |
| . | 1000 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 2000 | 66.7% |
| . | 1000 | 33.3% |
RH
Real number (ℝ≥0)
| Distinct | 717 |
|---|---|
| Distinct (%) | 71.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.07621 |
| Minimum | 14.21 |
|---|---|
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 14.21 |
|---|---|
| 5-th percentile | 15.5895 |
| Q1 | 19.03 |
| median | 24.075 |
| Q3 | 27.225 |
| 95-th percentile | 56.3045 |
| Maximum | 100 |
| Range | 85.79 |
| Interquartile range (IQR) | 8.195 |
Descriptive statistics
| Standard deviation | 12.77487643 |
|---|---|
| Coefficient of variation (CV) | 0.489905413 |
| Kurtosis | 10.11149938 |
| Mean | 26.07621 |
| Median Absolute Deviation (MAD) | 4.305 |
| Skewness | 2.932377073 |
| Sum | 26076.21 |
| Variance | 163.1974678 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 19.17 | 8 | 0.8% |
| 19.25 | 7 | 0.7% |
| 24.74 | 7 | 0.7% |
| 19.12 | 5 | 0.5% |
| 24.23 | 5 | 0.5% |
| 14.26 | 4 | 0.4% |
| 25.06 | 4 | 0.4% |
| 24.9 | 4 | 0.4% |
| 19.78 | 4 | 0.4% |
| 23.81 | 4 | 0.4% |
| Other values (707) | 948 | 94.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 14.21 | 1 | 0.1% |
| 14.24 | 2 | 0.2% |
| 14.25 | 1 | 0.1% |
| 14.26 | 4 | 0.4% |
| 14.36 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 100 | 2 | 0.2% |
| 98.8 | 1 | 0.1% |
| 97.94 | 1 | 0.1% |
| 97.81 | 1 | 0.1% |
| 95.44 | 1 | 0.1% |
Pres
Real number (ℝ≥0)
| Distinct | 380 |
|---|---|
| Distinct (%) | 38.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 978.97492 |
| Minimum | 976.58 |
|---|---|
| Maximum | 981.11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 976.58 |
|---|---|
| 5-th percentile | 977.0995 |
| Q1 | 977.91 |
| median | 979.015 |
| Q3 | 979.98 |
| 95-th percentile | 980.95 |
| Maximum | 981.11 |
| Range | 4.53 |
| Interquartile range (IQR) | 2.07 |
Descriptive statistics
| Standard deviation | 1.212822135 |
|---|---|
| Coefficient of variation (CV) | 0.001238869465 |
| Kurtosis | -1.075901529 |
| Mean | 978.97492 |
| Median Absolute Deviation (MAD) | 0.995 |
| Skewness | 0.04759768411 |
| Sum | 978974.92 |
| Variance | 1.470937531 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 980 | 10 | 1.0% |
| 979.28 | 9 | 0.9% |
| 979.99 | 9 | 0.9% |
| 979.31 | 8 | 0.8% |
| 977.77 | 7 | 0.7% |
| 980.94 | 7 | 0.7% |
| 977.7 | 7 | 0.7% |
| 979.32 | 7 | 0.7% |
| 979.98 | 7 | 0.7% |
| 977.66 | 7 | 0.7% |
| Other values (370) | 922 | 92.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
| 976.58 | 2 | 0.2% |
| 976.59 | 1 | 0.1% |
| 976.6 | 1 | 0.1% |
| 976.61 | 1 | 0.1% |
| 976.62 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
| 981.11 | 1 | 0.1% |
| 981.1 | 1 | 0.1% |
| 981.09 | 1 | 0.1% |
| 981.08 | 1 | 0.1% |
| 981.07 | 1 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
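The calculation described above can be sketched in a few lines of standard-library Python. This is an illustrative implementation, not the one used to produce the report; the function name `pearson_r` is an assumption, and the sample values are the first five Temp and Pres readings from the "First rows" table.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column (such as O3 in this dataset) has no defined r
        raise ValueError("r is undefined when a variable is constant")
    # The 1/n factors of covariance and standard deviations cancel out
    return cov / math.sqrt(var_x * var_y)


temp = [35.89, 35.88, 35.96, 35.91, 35.88]
pres = [978.20, 978.22, 978.30, 978.34, 978.28]
print(pearson_r(temp, pres))
```

Five rows are far too few to say anything about the full dataset; the call only shows the mechanics.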
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
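A minimal sketch of that procedure: replace each value by its rank, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is a common convention and is assumed here; the names `average_ranks` and `spearman_rho` are made up for the example.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson-style correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# A nonlinear but strictly increasing relationship
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

As with Pearson's r, the result is undefined when one variable is constant, a case this sketch does not guard against.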
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
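The pair-counting definition above translates directly into code. The sketch below implements exactly that formula, which is usually called tau-a; statistical libraries often apply a tie correction (tau-b), so their results may differ on data with many repeated values. The function name `kendall_tau_a` is an assumption for this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A zero product is a tie and counts as neither
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The loop visits every pair, so the cost grows quadratically with the number of observations; that is acceptable for 1000 rows but not for very large datasets.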
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation for Phik is available.
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | Time | Temp | PM25 | lux | VOC | CO | CO2 | O3 | RH | Pres |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 16:33:01 | 35.89 | 0.0 | 81.76 | 7.0 | 3.47 | 400.0 | 0.0 | 24.08 | 978.20 |
| 1 | 16:32:09 | 35.88 | 0.0 | 81.90 | 1.0 | 3.44 | 400.0 | 0.0 | 24.07 | 978.22 |
| 2 | 16:31:16 | 35.96 | 0.0 | 80.94 | 24.0 | 3.47 | 400.0 | 0.0 | 24.04 | 978.30 |
| 3 | 16:30:21 | 35.91 | 0.0 | 81.60 | 19.0 | 3.47 | 400.0 | 0.0 | 24.25 | 978.34 |
| 4 | 16:29:29 | 35.88 | 0.0 | 81.44 | 0.0 | 3.47 | 400.0 | 0.0 | 24.02 | 978.28 |
| 5 | 16:28:38 | 35.94 | 0.0 | 81.12 | 27.0 | 3.47 | 400.0 | 0.0 | 23.90 | 978.30 |
| 6 | 16:26:54 | 35.91 | 0.0 | 81.12 | 6.0 | 3.47 | 400.0 | 0.0 | 23.85 | 978.23 |
| 7 | 16:25:09 | 36.00 | 0.0 | 81.12 | 19.0 | 3.48 | 400.0 | 0.0 | 23.73 | 978.27 |
| 8 | 16:24:15 | 36.02 | 0.0 | 80.48 | 33.0 | 3.48 | 400.0 | 0.0 | 23.82 | 978.30 |
| 9 | 16:23:23 | 36.02 | 0.0 | 80.32 | 36.0 | 3.48 | 400.0 | 0.0 | 23.81 | 978.28 |
Last rows
| | Time | Temp | PM25 | lux | VOC | CO | CO2 | O3 | RH | Pres |
|---|---|---|---|---|---|---|---|---|---|---|
| 990 | 05:57:19 | 39.84 | 0.0 | 10.78 | 35.0 | 3.71 | 400.0 | 0.0 | 21.20 | 976.67 |
| 991 | 05:56:28 | 39.85 | 0.0 | 10.86 | 32.0 | 3.70 | 400.0 | 0.0 | 21.22 | 976.67 |
| 992 | 05:55:36 | 39.84 | 0.0 | 10.78 | 43.0 | 3.69 | 400.0 | 0.0 | 21.29 | 976.59 |
| 993 | 05:54:45 | 39.82 | 0.0 | 10.78 | 31.0 | 3.69 | 400.0 | 0.0 | 21.30 | 976.60 |
| 994 | 05:53:55 | 39.82 | 0.0 | 10.86 | 21.0 | 3.69 | 400.0 | 0.0 | 21.29 | 976.63 |
| 995 | 05:52:08 | 39.82 | 0.0 | 10.78 | 31.0 | 3.69 | 400.0 | 0.0 | 21.36 | 976.69 |
| 996 | 05:51:17 | 39.80 | 0.0 | 10.86 | 27.0 | 3.69 | 400.0 | 0.0 | 21.37 | 976.65 |
| 997 | 05:50:25 | 39.81 | 0.0 | 10.86 | 19.0 | 3.69 | 400.0 | 0.0 | 21.36 | 976.61 |
| 998 | 05:49:35 | 39.81 | 0.0 | 10.86 | 22.0 | 3.69 | 400.0 | 0.0 | 21.37 | 976.58 |
| 999 | 05:48:43 | 39.82 | 0.0 | 10.78 | 15.0 | 3.69 | 400.0 | 0.0 | 21.38 | 976.58 |